A Novel Approach to Data Augmentation for Pavement Distress Segmentation

Abstract. Accurate ground-truth annotations for semantic segmentation are difficult and expensive to obtain, yet the most promising approaches for tackling this task automatically, namely deep Convolutional Neural Networks (CNNs), require large volumes of labeled data. We propose a new deep-learning-based method for data augmentation in the context of semantic segmentation of highly textured images. The method uses a Generative Adversarial Network (GAN) to produce a semantic layout; a CNN-based texture synthesizer then generates a new image from the generated layout and a real reference image taken from the training set. Although the method is general and can be applied to a broad set of problems, we apply it to the real-world problem of detecting and localizing defects and cracks in road asphalt. We show that, starting from a few labeled images, it is possible to augment small and long-tailed datasets by producing new images together with their associated semantic layouts. We demonstrate the effectiveness of our approach by evaluating the performance of three different CNNs for semantic segmentation on the German Pavement Distress dataset and on a novel asphalt dataset that we collected. Results show a marked increase in performance, especially on low-cardinality classes, when the CNNs are trained on the augmented datasets rather than on the original ones.
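The data flow described in the abstract (sample a semantic layout, then render an image that follows it using a real reference image) can be illustrated with a minimal sketch. This is not the authors' implementation: `sample_layout` is a hypothetical stand-in for the GAN generator that draws a random toy layout, `synthesize_image` is a hypothetical stand-in for the CNN texture synthesizer that simply resamples reference pixels of the same class, and the three label ids are assumed for illustration only.

```python
import random

# Hypothetical label ids, chosen only for this sketch.
BACKGROUND, CRACK, POTHOLE = 0, 1, 2


def sample_layout(rng, height, width):
    """Stand-in for the GAN generator: returns a random label map."""
    layout = [[BACKGROUND] * width for _ in range(height)]
    # One horizontal "crack" spanning the image.
    crack_row = rng.randrange(height)
    for x in range(width):
        layout[crack_row][x] = CRACK
    # One 2x2 "pothole" patch (requires height >= 2 and width >= 2).
    top = rng.randrange(height - 1)
    left = rng.randrange(width - 1)
    for y in (top, top + 1):
        for x in (left, left + 1):
            layout[y][x] = POTHOLE
    return layout


def synthesize_image(layout, ref_image, ref_labels, rng):
    """Stand-in for the CNN texture synthesizer.

    Each output pixel copies the intensity of a random reference pixel
    with the same label; labels absent from the reference fall back to
    any reference pixel.
    """
    pools = {}
    for img_row, lab_row in zip(ref_image, ref_labels):
        for value, label in zip(img_row, lab_row):
            pools.setdefault(label, []).append(value)
    all_values = [v for row in ref_image for v in row]
    return [[rng.choice(pools.get(label, all_values)) for label in row]
            for row in layout]


def augment(training_set, n_new, seed=0):
    """Produce n_new (image, layout) pairs from a list of labeled images."""
    rng = random.Random(seed)
    new_pairs = []
    for _ in range(n_new):
        # Pick a real reference image and its labels from the training set.
        ref_image, ref_labels = rng.choice(training_set)
        height, width = len(ref_image), len(ref_image[0])
        layout = sample_layout(rng, height, width)
        image = synthesize_image(layout, ref_image, ref_labels, rng)
        new_pairs.append((image, layout))
    return new_pairs
```

Each returned pair consists of a synthetic image and the layout it was rendered from, so the layout can serve directly as the segmentation ground truth for that image. In the actual method both stand-ins would be replaced by trained networks.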
