Multi-View Deep Metric Learning for Volumetric Image Recognition

This paper presents a multi-view deep metric learning (MVDML) architecture for the recognition of volumetric image stacks. Unlike existing metric learning methods, which learn a Mahalanobis distance metric that maximizes inter-class variation and minimizes intra-class variation, the proposed approach learns a function that maps input volumetric images into a compact Euclidean space in which distances approximate the “semantic” distances in the input space. Training minimizes a contrastive loss function that drives the learned distance to be small for pairs of samples from the same class and large for pairs from different classes. The mapping from the input to the target space is a multi-view convolutional neural network (MVCNN), which combines information from multiple views of a volumetric image into a single compact feature descriptor. Experimental results on the nematode volumetric image database show that the proposed method outperforms models based on hand-crafted visual features, conventional metric learning methods, and deep classification models.
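The two ingredients named above, fusing per-view features into one descriptor and scoring pairs with a contrastive loss, can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: views are fused by element-wise max pooling, and the loss takes the common margin-based form with Euclidean distance. The function names `view_pool` and `contrastive_loss` and the `margin` parameter are hypothetical and chosen for illustration only.

```python
import math


def view_pool(view_features):
    """Fuse per-view feature vectors into one descriptor.

    Assumption: element-wise max across views; the abstract only says
    that views are combined into a single compact descriptor.
    """
    return [max(values) for values in zip(*view_features)]


def euclidean_distance(a, b):
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Margin-based contrastive loss for a single pair (assumed form).

    Same-class pairs are penalized by their squared distance.
    Different-class pairs are penalized only while closer than `margin`.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Two views of one sample, each described by a 2-D feature vector.
descriptor = view_pool([[1.0, 5.0], [3.0, 2.0]])

# A same-class pair far apart incurs a loss; a different-class pair
# already beyond the margin incurs none.
loss_same = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True)
loss_diff = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False)
```

In a real system the embeddings would come from the network and the loss would be averaged over many pairs and minimized by gradient descent; this sketch shows only the per-pair arithmetic.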
