CoCoNet: A Collaborative Convolutional Network applied to fine-grained bird species classification

We present an end-to-end deep network for fine-grained visual categorization called Collaborative Convolutional Network (CoCoNet). The network uses a collaborative layer after the convolutional layers to represent an image as an optimal weighted collaboration of features learned from the training samples as a whole, rather than one at a time. This gives CoCoNet more power to encode the fine-grained nature of the data when only limited samples are available. We perform a detailed study of its performance with 1-stage and 2-stage transfer learning. The ablation study shows that the proposed method consistently outperforms its constituent parts. CoCoNet also outperforms a few state-of-the-art competing methods. Experiments have been performed on the fine-grained bird species classification problem as a representative example, but the method may be applied to other similar tasks. We also introduce a new public dataset for fine-grained species recognition, a dataset of Indian endemic birds, and report initial results on it.
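To make the idea of a "weighted collaboration of features" concrete, the following is a minimal sketch of classical collaborative representation classification on fixed feature vectors, assuming a ridge-regularised least-squares formulation and a plain per-class residual rule. It is not the CoCoNet collaborative layer itself, which is trained end-to-end with the convolutional layers; the function name `crc_classify`, the regulariser `lam`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def crc_classify(X_train, y_train, x_test, lam=1e-2):
    """Classify x_test by collaborative representation (illustrative sketch).

    X_train: (d, n) matrix, one training feature vector per column.
    y_train: (n,) integer class labels.
    x_test:  (d,) test feature vector.
    lam:     ridge regularisation strength (assumed value).
    """
    n = X_train.shape[1]
    # All training samples, from every class, jointly represent the test
    # vector: alpha = (X^T X + lam * I)^{-1} X^T x.
    gram = X_train.T @ X_train + lam * np.eye(n)
    alpha = np.linalg.solve(gram, X_train.T @ x_test)

    # Score each class by how well its own share of the weights
    # reconstructs the test vector; the smallest residual wins.
    classes = np.unique(y_train)
    residuals = []
    for c in classes:
        mask = y_train == c
        recon = X_train[:, mask] @ alpha[mask]
        residuals.append(np.linalg.norm(x_test - recon))
    return int(classes[int(np.argmin(residuals))]), alpha

# Toy data (made up): two classes with two 3-dimensional samples each.
X_train = np.array([[1.0, 0.9, 0.0, 0.0],
                    [0.0, 0.1, 1.0, 0.9],
                    [0.0, 0.0, 0.0, 0.1]])
y_train = np.array([0, 0, 1, 1])

pred_a, alpha_a = crc_classify(X_train, y_train, np.array([0.95, 0.05, 0.0]))
pred_b, alpha_b = crc_classify(X_train, y_train, np.array([0.0, 0.95, 0.05]))
print(pred_a, pred_b)
```

The weights `alpha` are solved once over the whole training set, which is what distinguishes a collaborative representation from comparing the test sample with one training sample at a time.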
