Perceptual Feature Extraction

In this chapter,\(^\dagger\) we present a deep model-based and data-driven hybrid architecture (DMD) for feature extraction. First, we construct a deep learning pipeline for progressively learning image features from simple to complex. We mix this deep model-based pipeline with a data-driven pipeline, which extracts features from a large collection of unlabeled images. Sparse regularization is then performed on features extracted from both pipelines in an unsupervised way to obtain representative patches. Upon obtaining these patches, a supervised learning algorithm is employed to conduct object prediction. We present how DMD works and explain why it works more effectively than traditional models from both aspects of neuroscience and computational learning theory.

[1]  Honglak Lee,et al.  Sparse deep belief net model for visual area V2 , 2007, NIPS.

[2]  C. Gross Visual Functions of Inferotemporal Cortex , 1973 .

[3]  Tomaso Poggio,et al.  Learning a dictionary of shape-components in visual cortex: comparison with neurons, humans and machines , 2006 .

[4]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[5]  B. S. Manjunath,et al.  A texture descriptor for browsing and similarity retrieval , 2000, Signal Process. Image Commun..

[6]  Juergen Luettin,et al.  Audio-Visual Automatic Speech Recognition: An Overview , 2004 .

[7]  M. Sanders Handbook of Sensory Physiology , 1975 .

[8]  Marc'Aurelio Ranzato,et al.  Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Patrick Haffner,et al.  Support vector machines for histogram-based image classification , 1999, IEEE Trans. Neural Networks.

[10]  E. Y. Chang,et al.  Toward perception-based image retrieval , 2000, 2000 Proceedings Workshop on Content-based Access of Image and Video Libraries.

[11]  H. Komatsu,et al.  Color Selectivity of Neurons in the Posterior Inferior Temporal Cortex of the Macaque Monkey , 2009, Cerebral cortex.

[12]  Thomas Serre,et al.  Robust Object Recognition with Cortex-Like Mechanisms , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  T. Gawne,et al.  Responses of primate visual cortical V4 neurons to simultaneously presented stimuli. , 2002, Journal of neurophysiology.

[14]  Thomas Serre,et al.  Realistic Modeling of Simple and Complex Cell Tuning in the HMAX Model, and Implications for Invariant Object Recognition in Cortex , 2004 .

[15]  Yan Ke,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, CVPR 2004.

[16]  Yoshua. Bengio,et al.  Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[17]  D. V. van Essen,et al.  A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information , 1993, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[18]  Jia-Guu Leu Computing a shape's moments from its boundary , 1991, Pattern Recognit..

[19]  E. Miller,et al.  The prefontral cortex and cognitive control , 2000, Nature Reviews Neuroscience.

[20]  Helge Reikeras,et al.  Audio-visual automatic speech recognition using Dynamic Bayesian Networks , 2011 .

[21]  D. B. Bender,et al.  Visual properties of neurons in inferotemporal cortex of the Macaque. , 1972, Journal of neurophysiology.

[22]  J. P. Jones,et al.  An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. , 1987, Journal of neurophysiology.

[23]  Y. Yamane,et al.  Complex objects are represented in macaque inferotemporal cortex by the combination of feature columns , 2001, Nature Neuroscience.

[24]  Wei-Ying Ma,et al.  Benchmarking of image features for content-based retrieval , 1998, Conference Record of Thirty-Second Asilomar Conference on Signals, Systems and Computers (Cat. No.98CH36284).

[25]  Honglak Lee,et al.  Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.

[26]  Yoshua Bengio,et al.  Scaling learning algorithms towards AI , 2007 .

[27]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[28]  Thomas Serre,et al.  Modeling feature sharing between object detection and top-down attention , 2005 .

[29]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[30]  T. Poggio,et al.  A Bayesian inference theory of attention: neuroscience and algorithms , 2009 .

[31]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[32]  Tomaso Poggio,et al.  Intracellular measurements of spatial integration and the MAX operation in complex cells of the cat primary visual cortex. , 2004, Journal of neurophysiology.

[33]  Edward Y. Chang,et al.  A deep-learning model-based and data-driven hybrid architecture for image annotation , 2010, VLS-MCMR '10.

[34]  Geoffrey E. Hinton,et al.  Restricted Boltzmann machines for collaborative filtering , 2007, ICML '07.

[35]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[36]  Vapnik,et al.  SVMs for Histogram Based Image Classification , 1999 .

[37]  Hideyuki Tamura,et al.  Textural Features Corresponding to Visual Perception , 1978, IEEE Transactions on Systems, Man, and Cybernetics.

[38]  D. Hubel,et al.  Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of physiology.

[39]  T. Poggio,et al.  Are Cortical Models Really Bound by the “Binding Problem”? , 1999, Neuron.

[40]  Edward Y. Chang,et al.  Support vector machine active learning for image retrieval , 2001, MULTIMEDIA '01.

[41]  Thomas Serre,et al.  Object recognition with features inspired by visual cortex , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[42]  D. Hubel,et al.  Receptive fields and functional architecture of monkey striate cortex , 1968, The Journal of physiology.