Albumentations: fast and flexible image augmentations

Data augmentation is a commonly used technique for increasing both the size and the diversity of labeled training sets by leveraging input transformations that preserve the corresponding output labels. In computer vision, image augmentations have become a common implicit regularization technique to combat overfitting in deep learning models and are ubiquitously used to improve performance. While most deep learning frameworks implement basic image transformations, the list is typically limited to variations of flipping, rotating, scaling, and cropping. Moreover, image processing speed varies across existing image augmentation libraries. We present Albumentations, a fast and flexible open source library for image augmentation that offers a wide variety of image transform operations and also serves as an easy-to-use wrapper around other augmentation libraries. We discuss the design principles that drove the implementation of Albumentations and give an overview of its key features and distinct capabilities. Finally, we provide examples of image augmentations for different computer vision tasks and demonstrate that Albumentations is faster than other commonly used image augmentation tools on most image transform operations.
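To make the idea of a label-preserving transformation concrete, the following is a minimal sketch in plain Python, not the Albumentations API. It assumes a toy grayscale image and a segmentation mask stored as nested lists; the helper names `horizontal_flip` and `augment_pair` are hypothetical and introduced only for illustration. The point it shows is that a spatial transform must be applied to the image and its label together so that the two stay aligned.

```python
import random


def horizontal_flip(rows):
    """Return a copy of a 2D grid with every row reversed (left-right flip)."""
    return [list(reversed(row)) for row in rows]


def augment_pair(image, mask, p=0.5, rng=None):
    """Flip image and mask together with probability p.

    Both inputs receive the same transform, so each mask pixel still
    labels the same image pixel after augmentation.
    """
    rng = rng or random.Random()
    if rng.random() < p:
        return horizontal_flip(image), horizontal_flip(mask)
    # No flip: return copies so callers never mutate the originals.
    return [list(row) for row in image], [list(row) for row in mask]


# Toy 2x3 grayscale image and its binary segmentation mask (illustrative data).
image = [[10, 20, 30], [40, 50, 60]]
mask = [[0, 0, 1], [0, 1, 1]]

# p=1.0 forces the flip so the effect is deterministic in this example.
flipped_image, flipped_mask = augment_pair(image, mask, p=1.0)
```

In a training loop, a call like `augment_pair(image, mask, p=0.5)` would be made per sample, so each epoch sees a different random mix of original and flipped pairs.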
