Sill-Net: Feature Augmentation with Separated Illumination Representation

For visual object recognition tasks, the illumination variations can cause distinct changes in object appearance and thus confuse the deep neural network based recognition models. Especially for some rare illumination conditions, collecting sufficient training samples could be time-consuming and expensive. To solve this problem, in this paper we propose a novel neural network architecture called Separating-Illumination Network (Sill-Net). Sill-Net learns to separate illumination features from images, and then during training we augment training samples with these separated illumination features in the feature space. Experimental results demonstrate that our approach outperforms current state-of-the-art methods in several object classification benchmarks.

[1]  Yoshua Bengio,et al.  A Closer Look at Memorization in Deep Networks , 2017, ICML.

[2]  Taghi M. Khoshgoftaar,et al.  A survey on Image Data Augmentation for Deep Learning , 2019, Journal of Big Data.

[3]  Graham W. Taylor,et al.  Dataset Augmentation in Feature Space , 2017, ICLR.

[4]  Olivier Buisson,et al.  Logo retrieval with a contrario visual query expansion , 2009, ACM Multimedia.

[5]  Nicu Sebe,et al.  Deep Traffic Sign Detection and Recognition Without Target Domain Real Images , 2020, ArXiv.

[6]  Rainer Lienhart,et al.  Scalable logo recognition in real-world images , 2011, ICMR.

[7]  Liang Chen,et al.  GAN Augmentation: Augmenting Training Data using Generative Adversarial Networks , 2018, ArXiv.

[8]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[9]  Baoli Li,et al.  Traffic-Sign Detection and Classification in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Tae-Hyun Oh,et al.  Variational Prototyping-Encoder: One-Shot Learning With Prototypical Images , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[12]  Andrew Zisserman,et al.  Spatial Transformer Networks , 2015, NIPS.

[13]  Zhenguo Li,et al.  DecAug: Out-of-Distribution Generalization via Decomposed Feature Representation and Semantic Augmentation , 2020, AAAI.

[14]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[15]  Pieter Abbeel,et al.  A Simple Neural Attentive Meta-Learner , 2017, ICLR.

[16]  Johannes Stallkamp,et al.  Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition , 2012, Neural Networks.

[17]  Tae-Hyun Oh,et al.  Co-domain Embedding using Deep Quadruplet Networks for Unseen Traffic Sign Recognition , 2017, AAAI.

[18]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[19]  Nikos Komodakis,et al.  Dynamic Few-Shot Visual Learning Without Forgetting , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Pietro Perona,et al.  One-shot learning of object categories , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Hongyi Zhang,et al.  mixup: Beyond Empirical Risk Minimization , 2017, ICLR.

[22]  Hiroshi Inoue,et al.  Data Augmentation by Pairing Samples for Images Classification , 2018, ArXiv.

[23]  Cheng Wu,et al.  Regularizing Deep Networks With Semantic Data Augmentation , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Olivier Buisson,et al.  Scalable mining of small visual objects , 2012, ACM Multimedia.

[25]  Simon Haykin,et al.  GradientBased Learning Applied to Document Recognition , 2001 .

[26]  Aljosa Smolic,et al.  STaDA: Style Transfer as Data Augmentation , 2019, VISIGRAPP.

[27]  Quanming Yao,et al.  Decoupling Representation and Classifier for Noisy Label Learning , 2020, ArXiv.

[28]  Hong Yu,et al.  Meta Networks , 2017, ICML.

[29]  Jinwen Ma,et al.  ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes , 2018, ECCV.

[30]  Stefan Bauer,et al.  Robustly Disentangled Causal Mechanisms: Validating Deep Representations for Interventional Robustness , 2018, ICML.

[31]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[32]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[34]  Shaogang Gong,et al.  Deep Learning Logo Detection with Data Expansion by Synthesising Context , 2016, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).