Semiconducting bilinear deep learning for incomplete image recognition

Image recognition with incomplete data is a well-known hard problem in multimedia content analysis. This paper proposes a novel deep learning technique, semiconducting bilinear deep belief networks (SBDBN), inspired by the human visual cortex and intelligent perception. Building on deep models, SBDBN simulates the laminar structure of the human cerebral cortex and the neural loop in the human visual areas. To address the particular difficulties of image recognition with incomplete data, we design a novel second-order deep architecture with semiconducting restricted Boltzmann machines. Moreover, the two-peak activation of human perception is implemented through three learning stages: semiconducting bilinear discriminant initialization, greedy layer-wise reconstruction, and global fine-tuning. Because it exploits the embedding information in the reliable features rather than attempting to complete the missing features, the proposed SBDBN demonstrates outstanding recognition ability on two standard datasets and one constructed dataset, compared with both incomplete image recognition techniques and existing deep learning models.
