Viewpoint Estimation for Workpieces with Deep Transfer Learning from Cold to Hot

With the revival of deep neural networks, the viewpoint estimation problem can be handled using learned distinctive features. However, viewpoint annotations for real-world industrial workpieces are scarce and expensive to obtain, which impedes practical application. In this paper, we propose a deep transfer learning method for viewpoint estimation that transfers prior knowledge from labeled synthetic images to unlabeled real images. The synthetic images are rendered from 3D Computer-Aided Design (CAD) models and annotated automatically. To boost the performance of the deep transfer network, we design a new two-stage training strategy called cold-to-hot training. In the cold-start stage, the deep network is trained jointly for classification and knowledge transfer, without any labels for the real images. Once training enters the hot stage, pseudo labels of the real images are used to control the distribution of the input data. Experimental results demonstrate the effectiveness of the proposed method for viewpoint estimation when annotated real workpiece images are scarce.
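
The hot stage described above can be illustrated with a minimal sketch. The abstract does not say how pseudo labels are selected or how they control the input distribution, so everything below is an assumption made for illustration: viewpoints are treated as discrete bins, a pseudo label is kept only when the network's top probability exceeds a hypothetical confidence threshold, and "controlling the distribution" is interpreted as drawing the same number of real images from each pseudo-labeled bin. The function names `assign_pseudo_labels` and `sample_balanced_batch` are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def assign_pseudo_labels(probabilities, threshold=0.9):
    """Return {image_index: viewpoint_bin} for confident predictions only.

    `threshold` is an assumed confidence cut-off, not a value from the paper.
    """
    labels = {}
    for index, probs in enumerate(probabilities):
        best_bin = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_bin] >= threshold:
            labels[index] = best_bin
    return labels


def sample_balanced_batch(pseudo_labels, per_bin, rng):
    """Draw up to `per_bin` real-image indices from every pseudo-labeled bin.

    Balancing per bin is one assumed way to control the input distribution.
    """
    by_bin = defaultdict(list)
    for index, bin_id in pseudo_labels.items():
        by_bin[bin_id].append(index)
    batch = []
    for bin_id in sorted(by_bin):
        members = by_bin[bin_id]
        batch.extend(rng.sample(members, min(per_bin, len(members))))
    return batch


# Toy softmax outputs for five unlabeled real images over three viewpoint bins.
real_probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],  # not confident enough, so it receives no pseudo label
    [0.01, 0.97, 0.02],
    [0.92, 0.05, 0.03],
    [0.05, 0.04, 0.91],
]
pseudo = assign_pseudo_labels(real_probs, threshold=0.9)
batch = sample_balanced_batch(pseudo, per_bin=1, rng=random.Random(0))
print(pseudo)
print(batch)
```

In this toy run the second image is left without a pseudo label because its highest probability is below the threshold, and the sampled batch contains one real image per viewpoint bin. In a full pipeline the probabilities would come from the network trained in the cold-start stage.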
