Biologically Inspired Visual Model With Preliminary Cognition and Active Attention Adjustment

Recently, many computational models have been proposed to simulate the visual cognition process. For example, the hierarchical Max-Pooling (HMAX) model was proposed according to the hierarchical, bottom-up structure of areas V1 to V4 in the ventral pathway of the primate visual cortex, and it achieves position- and scale-tolerant recognition. In our previous work, we introduced memory and association into the HMAX model to simulate the visual cognition process. In this paper, we improve our theoretical framework by mimicking a more elaborate structure and function of the primate visual cortex. We focus mainly on the new formation of memory and association in visual processing under different circumstances, as well as on preliminary cognition and active adjustment in the inferior temporal cortex, which are absent in the HMAX model. The main contributions of this paper are as follows. 1) In the memory and association part, we apply deep convolutional neural networks to extract various episodic features of objects, since people use different features for object recognition. Moreover, to achieve fast and robust recognition in the retrieval and association process, different types of features are stored in separate clusters, and the feature binding of the same object is stimulated in a loop discharge manner. 2) In the preliminary cognition and active adjustment part, we introduce preliminary cognition to classify different types of objects, since distinct neural circuits in the human brain are used to identify different types of objects. Furthermore, active cognition adjustment for occlusion and orientation is implemented in the model to mimic the top-down effect in the human cognition process. Finally, our model is evaluated on two face databases, CAS-PEAL-R1 and AR. The results demonstrate that our model performs visual recognition efficiently, with a much lower memory storage requirement and better performance than traditional purely computational methods.
