3D Object Recognition Using Convolutional Neural Networks with Transfer Learning Between Input Channels

RGB-D data is getting ever more interest from the research community as both cheap cameras appear in the market and the applications of this type of data become more common. A current trend in processing image data is the use of convolutional neural networks (CNNs) that have consistently beat competition in most benchmark data sets. In this paper, we investigate the possibility of transferring knowledge between CNNs when processing RGB-D data with the goal of both improving accuracy and reducing training time. We present experiments that show that our proposed approach can achieve both these goals.

[1]  Luís A. Alexandre 3D Descriptors for Object and Category Recognition: a Comparative Evaluation , 2012 .

[2]  Yoshua. Bengio,et al.  Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[3]  Yoshua Bengio,et al.  Convolutional networks for images, speech, and time series , 1998 .

[4]  Jürgen Schmidhuber,et al.  Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Luís A. Alexandre,et al.  Improving Performance on Problems with Few Labelled Data by Reusing Stacked Auto-Encoders , 2014, 2014 13th International Conference on Machine Learning and Applications.

[6]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[7]  Nicolas Le Roux,et al.  Deep Belief Networks Are Compact Universal Approximators , 2010, Neural Computation.

[8]  Dieter Fox,et al.  A large-scale hierarchical multi-view RGB-D object dataset , 2011, 2011 IEEE International Conference on Robotics and Automation.

[9]  Kunihiko Fukushima,et al.  Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[10]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[11]  Yoshua Bengio,et al.  Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[12]  Barbara Caputo,et al.  Multi-modal Semantic Place Classification , 2010, Int. J. Robotics Res..

[13]  Jürgen Schmidhuber,et al.  Transfer learning for Latin and Chinese characters with Deep Neural Networks , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).

[14]  Andrew Y. Ng,et al.  Convolutional-Recursive Deep Learning for 3D Object Classification , 2012, NIPS.

[15]  Sílvio Filipe,et al.  RETRACTED ARTICLE: From the human visual system to the computational models of visual attention: a survey , 2015, Artificial Intelligence Review.