Filter-Invariant Image Classification on Social Media Photos

With the popularity of social media, enormous numbers of photos are uploaded every day. To understand image content, image classification is an essential technique for many applications (e.g., object detection and image caption generation). Convolutional neural networks (CNNs) are the state-of-the-art approach to image classification. However, social media photos, especially those on Instagram, often have photo filters applied. We find that prior work does not account for this trend and fails on filtered images. We therefore propose a novel CNN architecture that exploits pairwise constraints by combining a Siamese network with our proposed adaptive margin contrastive loss and a discriminative pair sampling method to address filter bias. To the best of our knowledge, this is the first work to tackle filter bias in CNNs, and it achieves state-of-the-art performance on a filtered subset of ILSVRC2012.
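
As a point of orientation, the pairwise constraint mentioned above builds on the contrastive loss used with Siamese networks. The sketch below shows only the standard fixed-margin form of that loss for a single pair of embeddings; it is not the adaptive margin variant proposed here, whose margin rule this abstract does not specify. The function names, the plain-list embeddings, and the `margin` default are assumptions made for illustration.

```python
import math


def euclidean_distance(emb_a, emb_b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Standard fixed-margin contrastive loss for one pair (illustrative only).

    Similar pairs are pulled together (loss grows with distance);
    dissimilar pairs are pushed apart until they are at least `margin` away.
    `margin` is a plain hyperparameter here; an adaptive scheme would
    choose it per pair, by a rule not given in this abstract.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Hypothetical embeddings produced by the two weight-sharing branches.
loss_similar = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True)
loss_dissimilar = contrastive_loss([0.0, 0.0], [0.0, 0.0], same_class=False, margin=2.0)
```

In this form, a dissimilar pair that is already farther apart than `margin` contributes zero loss, which is why the choice of margin and of which pairs to sample matters for training.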
