Paper Information - Spatial Hierarchical Analysis Deep Neural Network for RGB-D Object Recognition

Deep learning based object recognition methods have achieved unprecedented success in recent years. However, this level of success has yet to be achieved on multimodal RGB-D images, which can play an important role in several computer vision and robotics applications. In this paper, we present a spatial hierarchical analysis deep neural network, called ShaNet, for RGB-D object recognition. Our network consists of a convolutional neural network (CNN) and recurrent neural networks (RNNs) that analyse and learn distinctive and translationally invariant features in a hierarchical fashion. Unlike existing methods, which employ pre-trained models or rely on transfer learning, our proposed network is trained from scratch on RGB-D data. The proposed model has been tested on two publicly available RGB-D datasets: the Washington RGB-D object dataset and the 2D3D object dataset. Our experimental results show that the proposed deep neural network achieves superior performance compared to existing RGB-D object recognition methods.

Syed Afaq Ali Shah
