Metric learning-based kernel transformer with triplets and label constraints for feature fusion

Abstract Feature fusion is an important skill to improve the performance in computer vision, the difficult problem of feature fusion is how to learn the complementary properties of different features. We recognize that feature fusion can benefit from kernel metric learning. Thus, a metric learning-based kernel transformer method for feature fusion is proposed in this paper. First, we propose a kernel transformer to convert data from data space to kernel space, which makes feature fusion and metric learning can be performed in the transformed kernel space. Second, in order to realize supervised learning, both triplets and label constraints are embedded into our model. Third, in order to solve the unknown kernel matrices, LogDet divergence is also introduced into our model. Finally, a complete optimization objective function is formed. Based on an alternating direction method of multipliers (ADMM) solver and the Karush-Kuhn-Tucker (KKT) theorem, the proposed optimization problem is solved with the rigorous theoretical analysis. Experimental results on image retrieval demonstrate the effectiveness of the proposed methods.

[1]  Inderjit S. Dhillon,et al.  Metric and Kernel Learning Using a Linear Transformation , 2009, J. Mach. Learn. Res..

[2]  Qing Wang,et al.  Distance metric optimization driven convolutional neural network for age invariant face recognition , 2018, Pattern Recognit..

[3]  Gang Sun,et al.  Squeeze-and-Excitation Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4]  Hong Yan,et al.  Directional Statistics-based Deep Metric Learning for Image Classification and Retrieval , 2018, Pattern Recognit..

[5]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[7]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8]  Yi-Gang Cen,et al.  SURF binarization and fast codebook construction for image retrieval , 2017, J. Vis. Commun. Image Represent..

[9]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[11]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[12]  Hongming Zhou,et al.  Extreme Learning Machine for Regression and Multiclass Classification , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[13]  Zhi Zhang,et al.  Supervised Deep Feature Embedding With Handcrafted Feature , 2019, IEEE Transactions on Image Processing.

[14]  Cordelia Schmid,et al.  Aggregating Local Image Descriptors into Compact Codes , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Xiaogang Wang,et al.  Learning Deep Neural Networks for Vehicle Re-ID with Visual-spatio-Temporal Path Proposals , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[16]  Ping Li,et al.  Gaussian process approach for metric learning , 2019, Pattern Recognit..

[17]  Ling-Yu Duan,et al.  Incorporating intra-class variance to fine-grained visual recognition , 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).

[18]  Feng Zhou,et al.  Embedding Label Structures for Fine-Grained Feature Representation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Shuang Wu,et al.  Multimodal feature fusion for robust event detection in web videos , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[20]  Shengcai Liao,et al.  Person re-identification by Local Maximal Occurrence representation and metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Hamid Reza Karimi,et al.  LogDet Divergence-Based Metric Learning With Triplet Constraints and Its Applications , 2014, IEEE Transactions on Image Processing.

[22]  Jason Jianjun Gu,et al.  An Efficient Method for Traffic Sign Recognition Based on Extreme Learning Machine , 2017, IEEE Transactions on Cybernetics.

[23]  Fan Zhang,et al.  Distance learning by mining hard and easy negative samples for person re-identification , 2019, Pattern Recognit..

[24]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[25]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2006, ICML '07.

[26]  Stephen P. Boyd,et al.  Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..

[27]  Inderjit S. Dhillon,et al.  Low-Rank Kernel Learning with Bregman Matrix Divergences , 2009, J. Mach. Learn. Res..