DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer

We have witnessed rapid evolution of deep neural network architecture design in the past years. These latest progresses greatly facilitate the developments in various areas such as computer vision and natural language processing. However, along with the extraordinary performance, these state-of-the-art models also bring in expensive computational cost. Directly deploying these models into applications with real-time requirement is still infeasible. Recently, Hinton etal. have shown that the dark knowledge within a powerful teacher model can significantly help the training of a smaller and faster student network. These knowledge are vastly beneficial to improve the generalization ability of the student model. Inspired by their work, we introduce a new type of knowledge -- cross sample similarities for model compression and acceleration. This knowledge can be naturally derived from deep metric learning model. To transfer them, we bring the "learning to rank" technique into deep metric learning formulation. We test our proposed DarkRank method on various metric learning tasks including pedestrian re-identification, image retrieval and image clustering. The results are quite encouraging. Our method can improve over the baseline method by a large margin. Moreover, it is fully compatible with other existing methods. When combined, the performance can be further boosted.

[1]  Nikos Komodakis,et al.  Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer , 2016, ICLR.

[2]  Rong Jin,et al.  Fine-grained visual categorization via multi-stage metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Amnon Shashua,et al.  Ranking with Large Margin Principle: Two Approaches , 2002, NIPS.

[5]  Ran El-Yaniv,et al.  Binarized Neural Networks , 2016, ArXiv.

[6]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Quoc V. Le,et al.  Learning to Rank with Nonsmooth Cost Functions , 2006, Neural Information Processing Systems.

[9]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[10]  Nanning Zheng,et al.  Person Re-identification by Multi-Channel Parts-Based CNN with Improved Triplet Loss Function , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  John B. Shoven,et al.  I , Edinburgh Medical and Surgical Journal.

[12]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Yunhong Wang,et al.  Relevance Metric Learning for Person Re-Identification by Exploiting Listwise Similarities , 2015, IEEE Transactions on Image Processing.

[14]  Thore Graepel,et al.  Large Margin Rank Boundaries for Ordinal Regression , 2000 .

[15]  Filip Radlinski,et al.  A support vector method for optimizing average precision , 2007, SIGIR.

[16]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[17]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[19]  Carlos D. Castillo,et al.  L2-constrained Softmax Loss for Discriminative Face Verification , 2017, ArXiv.

[20]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[21]  Lin Xiong,et al.  A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion , 2017, ArXiv.

[22]  J. Koenderink Q… , 2014, Les noms officiels des communes de Wallonie, de Bruxelles-Capitale et de la communaute germanophone.

[23]  Meng Yang,et al.  Large-Margin Softmax Loss for Convolutional Neural Networks , 2016, ICML.

[24]  Dong Liu,et al.  Multi-Scale Triplet CNN for Person Re-Identification , 2016, ACM Multimedia.

[25]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.

[27]  Lucas Beyer,et al.  In Defense of the Triplet Loss for Person Re-Identification , 2017, ArXiv.

[28]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Yoshua Bengio,et al.  FitNets: Hints for Thin Deep Nets , 2014, ICLR.

[30]  Song Han,et al.  Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.

[31]  Zhuowen Tu,et al.  Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Tie-Yan Liu,et al.  Listwise approach to learning to rank: theory and algorithm , 2008, ICML '08.

[33]  Zheng Zhang,et al.  MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems , 2015, ArXiv.

[34]  Ivor W. Tsang,et al.  Learning with Idealized Kernels , 2003, ICML.

[35]  Ali Farhadi,et al.  XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.

[36]  A. Choromańska Extreme Multi Class Classification , 2013 .

[37]  Qi Tian,et al.  Scalable Person Re-identification on Supervised Smoothed Manifold , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[39]  Yann LeCun,et al.  Optimal Brain Damage , 1989, NIPS.

[40]  David Zhang,et al.  Joint Learning of Single-Image and Cross-Image Representations for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Tong Zhang,et al.  Subset Ranking Using Regression , 2006, COLT.

[42]  Naiyan Wang,et al.  Like What You Like: Knowledge Distill via Neuron Selectivity Transfer , 2017, ArXiv.

[43]  Silvio Savarese,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Qi Tian,et al.  Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[45]  Razvan Pascanu,et al.  Sobolev Training for Neural Networks , 2017, NIPS.

[46]  Tie-Yan Liu,et al.  Learning to rank: from pairwise approach to listwise approach , 2007, ICML '07.

[47]  Yann LeCun,et al.  Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell..

[48]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[49]  Nitish Srivastava Unsupervised Learning of Visual Representations using Videos , 2015 .