Cooperative Object Classification for Driving Applications

3D object classification can be realised by rendering views of the same object from different angles and aggregating them to build a classifier. Although this approach has previously been proposed for general object classification, most existing work does not consider visual impairments. In contrast, this paper addresses 3D object classification for driving applications under impairments (e.g. occlusion and sensor noise) and generates an application-specific dataset for that purpose. We present a cooperative object classification method in which multiple images of the same object, seen from different perspectives (agents), are exploited to produce a more accurate classification. We consider the model's generalisation capability and its resilience to impairments. We introduce an occlusion model that more closely resembles real-world occlusion and use a simplified sensor noise model. The experimental results show that the cooperative model, relying on multiple views, significantly outperforms single-view methods and is effective in mitigating the effects of occlusion and sensor noise.
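
To make the idea of aggregating views concrete, the following is a minimal sketch of one possible fusion rule: averaging the per-view class probabilities and taking the most likely class. It is an illustration only, not the paper's method; the function names `softmax` and `fuse_views`, the three-class setup, and the example logits are all assumptions, and the actual model may fuse views differently (for instance inside the network).

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_views(view_logits):
    """Hypothetical late fusion: average class probabilities across views.

    view_logits has shape (num_views, num_classes), one row per agent.
    Returns the fused class index and the fused probability vector.
    """
    probs = softmax(np.asarray(view_logits, dtype=float))
    mean_probs = probs.mean(axis=0)
    return int(mean_probs.argmax()), mean_probs

# Three agents observe the same object; the second view is assumed to be
# heavily occluded, so its scores are nearly uninformative.
view_logits = [
    [2.0, 1.0, 0.1],
    [0.2, 0.3, 0.1],
    [1.5, 0.5, 0.2],
]
label, probs = fuse_views(view_logits)
print(label, probs)
```

In this toy case the occluded view alone would favour a different class, while the fused prediction follows the two clearer views, which is the kind of robustness the cooperative setting aims for.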
