Automatic “Ground Truth” Annotation and Industrial Workpiece Dataset Generation for Deep Learning

Deep learning methods are increasingly used in industry to detect and recognize workpieces. A major obstacle in this field is the lack of datasets: collecting and annotating images is very labor intensive, and researchers who generate a dataset themselves must also annotate it themselves. This is one of the factors that restricts how well current deep learning based methods scale. At present, there are very few workpiece datasets for industrial fields, and the existing ones are generated from ideal computer aided design (CAD) models of workpieces, with few images of actual workpieces collected or used. We propose an automatic industrial workpiece dataset generation method and an automatic ground truth annotation method. The methods comprise three algorithms that we propose: a point cloud based spatial plane segmentation algorithm, which segments the workpieces in the real scene and obtains the annotation information of the workpieces in images captured in that scene; a random multiple workpiece generation algorithm, which generates abundant composition datasets in which workpieces have random rotation angles and positions; and a tangent vector based contour tracking and completion algorithm, which produces improved contour images. With these procedures, the annotation information is obtained by the proposed algorithms, and a JSON file is generated when the annotation process completes. Faster R-CNN (faster region-based convolutional neural network), SSD (single shot multibox detector) and YOLO (you only look once: unified, real-time object detection) are trained on the datasets produced in this paper. The experimental results show the effectiveness and integrity of this dataset generation and annotation method.
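To make the idea of random multiple workpiece generation with JSON annotation concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes each workpiece can be approximated by a rectangle of known size, places it at a random rotation angle and position, rejects placements that leave the canvas or overlap an earlier workpiece, and writes the result as JSON. The function names (`rotated_bbox`, `compose_scene`), the workpiece labels, and the JSON field names are all hypothetical and chosen only for illustration.

```python
import json
import math
import random


def rotated_bbox(cx, cy, w, h, angle_deg):
    """Axis-aligned box enclosing a w x h rectangle rotated about its center."""
    a = math.radians(angle_deg)
    half_w = (abs(w * math.cos(a)) + abs(h * math.sin(a))) / 2
    half_h = (abs(w * math.sin(a)) + abs(h * math.cos(a))) / 2
    return [cx - half_w, cy - half_h, cx + half_w, cy + half_h]


def overlaps(a, b):
    """True if two [x_min, y_min, x_max, y_max] boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def compose_scene(workpieces, canvas_w, canvas_h, rng, max_tries=100):
    """Place each (label, width, height) workpiece at a random pose.

    A workpiece is skipped if no valid pose is found within max_tries.
    The returned dict uses a hypothetical annotation schema.
    """
    objects = []
    for label, w, h in workpieces:
        for _ in range(max_tries):
            angle = round(rng.uniform(0.0, 360.0), 2)
            cx = rng.uniform(0.0, canvas_w)
            cy = rng.uniform(0.0, canvas_h)
            box = [round(v, 2) for v in rotated_bbox(cx, cy, w, h, angle)]
            inside = (box[0] >= 0 and box[1] >= 0
                      and box[2] <= canvas_w and box[3] <= canvas_h)
            if inside and not any(overlaps(box, o["bbox"]) for o in objects):
                objects.append({"label": label, "angle_deg": angle, "bbox": box})
                break
    return {"width": canvas_w, "height": canvas_h, "objects": objects}


if __name__ == "__main__":
    # Hypothetical workpiece classes with sizes in pixels.
    parts = [("gear", 60, 40), ("bracket", 80, 30), ("flange", 50, 50)]
    scene = compose_scene(parts, 640, 480, random.Random(0))
    print(json.dumps(scene, indent=2))
```

In this sketch the bounding box of every placed workpiece follows directly from the sampled pose, so the annotation comes for free with the composition. A real pipeline would paste segmented workpiece images rather than rectangles and would likely store contours as well as boxes.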
