Fast Inference Predictive Coding: A Novel Model for Constructing Deep Neural Networks

As a biomimetic model of visual information processing, predictive coding (PC) has become increasingly popular for explaining a range of neural responses and many aspects of brain organization. While the development of PC models has been encouraging within the neurobiology community, their practical applications in machine learning (e.g., image classification) have not yet been fully explored. In this paper, a novel image processing model called fast inference PC (FIPC) is presented for image representation and classification. Compared with the basic PC model, the proposed FIPC model adds a regression procedure and a classification layer. The regression procedure learns regression mappings that enable fast inference at test time, while the classification layer guides the model to extract more discriminative features. In addition, effective learning and fine-tuning algorithms are developed for the proposed model. Experimental results on four image benchmark data sets show that our model infers representations directly and quickly while simultaneously producing lower error rates on image classification tasks.
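The core idea of the regression procedure can be illustrated with a minimal sketch. Assume a single-layer linear PC model in which an input x is explained by a representation r through a generative weight matrix W (so x ≈ W r): standard PC infers r by iteratively minimizing the prediction error, whereas a regression mapping trained to reproduce those inferred codes turns test-time inference into a single matrix multiplication. All dimensions, learning rates, and the ridge-regression shortcut below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: inputs of dimension 20, representations of dimension 10.
n_x, n_r, n_samples = 20, 10, 200
W = rng.normal(size=(n_x, n_r))          # generative (top-down) weights
X = rng.normal(size=(n_samples, n_x))    # training inputs

def pc_infer(x, W, steps=500, lr=0.01):
    """Slow iterative PC inference: gradient descent on ||x - W r||^2."""
    r = np.zeros(W.shape[1])
    for _ in range(steps):
        err = x - W @ r          # bottom-up prediction error
        r += lr * (W.T @ err)    # error-driven update of the representation
    return r

# Slow path: infer a code for every training input iteratively.
R = np.stack([pc_infer(x, W) for x in X])

# Fast path: fit a ridge-regularized linear mapping x -> r on the
# (input, inferred code) pairs, so inference becomes one forward pass.
lam = 1e-3
M = np.linalg.solve(X.T @ X + lam * np.eye(n_x), X.T @ R)   # shape (n_x, n_r)

# At test time the learned mapping closely matches iterative inference.
x_test = rng.normal(size=n_x)
r_slow = pc_infer(x_test, W)
r_fast = x_test @ M
print(np.max(np.abs(r_slow - r_fast)))   # small discrepancy
```

In this linear toy case the iterative solution is itself a linear function of the input, so the regression mapping recovers it almost exactly; in a deep nonlinear model the mapping would only approximate the iterative codes, trading some accuracy for a large speedup at test time.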
