Deep Neighborhood Component Analysis for Visual Similarity Modeling

Learning effective visual similarity is an essential problem in multimedia research. Despite the promising progress made in recent years, most existing approaches learn visual features and similarities in two separate stages, which inevitably limits their performance. Once useful information has been lost in the feature extraction stage, it can hardly be recovered later. This article proposes a novel end-to-end approach for visual similarity modeling, called deep neighborhood component analysis, which discriminatively trains deep neural networks to jointly learn visual features and similarities. Specifically, we first formulate a metric learning objective that maximizes the intra-class correlations and minimizes the inter-class correlations under the neighborhood component analysis criterion, and then train deep convolutional neural networks to learn a nonlinear mapping that projects visual instances from original feature space to a discriminative and neighborhood-structure-preserving embedding space, thus resulting in better performance. We conducted extensive evaluations on several widely used and challenging datasets, and the impressive results demonstrate the effectiveness of our proposed approach.

[1]  Yue Gao,et al.  View-Based Discriminative Probabilistic Modeling for 3D Object Retrieval and Recognition , 2013, IEEE Transactions on Image Processing.

[2]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .

[3]  Yun Fu,et al.  Robust Transfer Metric Learning for Image Classification , 2017, IEEE Transactions on Image Processing.

[4]  Cordelia Schmid,et al.  Is that you? Metric learning approaches for face identification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[5]  Matti Pietikäinen,et al.  A comparative study of texture measures with classification based on featured distributions , 1996, Pattern Recognit..

[6]  Jiwen Lu,et al.  Hardness-Aware Deep Metric Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Marc Sebban,et al.  A Survey on Metric Learning for Feature Vectors and Structured Data , 2013, ArXiv.

[8]  Fuzhen Zhuang,et al.  Supervised Representation Learning with Double Encoding-Layer Autoencoder for Transfer Learning , 2017, ACM Trans. Intell. Syst. Technol..

[9]  Yair Movshovitz-Attias,et al.  No Fuss Distance Metric Learning Using Proxies , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[10]  Rong Jin,et al.  Distance Metric Learning: A Comprehensive Survey , 2006 .

[11]  Geoffrey E. Hinton,et al.  Neighbourhood Components Analysis , 2004, NIPS.

[12]  Yao Zhao,et al.  Modality-Dependent Cross-Media Retrieval , 2015, ACM Trans. Intell. Syst. Technol..

[13]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Meng Wang,et al.  Person Reidentification via Structural Deep Metric Learning , 2019, IEEE Transactions on Neural Networks and Learning Systems.

[15]  Geoffrey E. Hinton,et al.  Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure , 2007, AISTATS.

[16]  LiuWei,et al.  Semi-supervised distance metric learning for collaborative image retrieval and clustering , 2010 .

[17]  足立 浩平 Ingwer Borg and Patrick Groenen--Modern Multidimensional Scaling-Theory and Applications , 1998 .

[18]  Meng Wang,et al.  Person Re-Identification With Metric Learning Using Privileged Information , 2018, IEEE Transactions on Image Processing.

[19]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[20]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[21]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[22]  Lorenzo Torresani,et al.  Large Margin Component Analysis , 2006, NIPS.

[23]  Michael Jones,et al.  An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Ling Shao,et al.  Unsupervised Deep Video Hashing via Balanced Code for Large-Scale Video Retrieval , 2019, IEEE Transactions on Image Processing.

[25]  Jiwen Lu,et al.  Deep transfer metric learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Li Bai,et al.  Cosine Similarity Metric Learning for Face Verification , 2010, ACCV.

[27]  Nan Jiang,et al.  Data-Driven Spatially-Adaptive Metric Adjustment for Visual Tracking , 2014, IEEE Transactions on Image Processing.

[28]  Xiaofei He,et al.  Locality Preserving Projections , 2003, NIPS.

[29]  Meng Wang,et al.  Learning Visual Semantic Relationships for Efficient Visual Retrieval , 2015, IEEE Transactions on Big Data.

[30]  Qiang Ni,et al.  Joint Image-Text Hashing for Fast Large-Scale Cross-Media Retrieval Using Self-Supervised Deep Learning , 2019, IEEE Transactions on Industrial Electronics.

[31]  Meng Wang,et al.  Visual Classification by ℓ1-Hypergraph Modeling , 2015, IEEE Trans. Knowl. Data Eng..

[32]  Andrew Zisserman,et al.  Deep Face Recognition , 2015, BMVC.

[33]  Liang-Tien Chia,et al.  Learning image-to-class distance metric for image classification , 2013, TIST.

[34]  Patrick J. F. Groenen,et al.  Modern Multidimensional Scaling: Theory and Applications , 2003 .

[35]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[36]  Alberto Del Bimbo,et al.  Socializing the Semantic Gap , 2015, ACM Comput. Surv..

[37]  Chunheng Wang,et al.  Deep nonlinear metric learning with independent subspace analysis for face verification , 2012, ACM Multimedia.

[38]  Bo Geng,et al.  DAML: Domain Adaptation Metric Learning , 2011, IEEE Transactions on Image Processing.

[39]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Jiwen Lu,et al.  Discriminative Deep Metric Learning for Face and Kinship Verification , 2017, IEEE Transactions on Image Processing.

[41]  Brian Kulis,et al.  Metric Learning: A Survey , 2013, Found. Trends Mach. Learn..

[42]  Honglak Lee,et al.  An Analysis of Single-Layer Networks in Unsupervised Feature Learning , 2011, AISTATS.

[43]  Jiwen Lu,et al.  Deep Metric Learning for Visual Tracking , 2016, IEEE Transactions on Circuits and Systems for Video Technology.

[44]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[45]  Meng Wang,et al.  Movie2Comics: Towards a Lively Video Content Presentation , 2012, IEEE Transactions on Multimedia.

[46]  Yann LeCun,et al.  Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[47]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[48]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[49]  Samuel Kaski,et al.  Informative Discriminant Analysis , 2003, ICML.

[50]  Fei Wang,et al.  Survey on distance metric learning and dimensionality reduction in data mining , 2014, Data Mining and Knowledge Discovery.

[51]  Jungong Han,et al.  Robust Quantization for General Similarity Search , 2018, IEEE Transactions on Image Processing.

[52]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[53]  Andrew Y. Ng,et al.  Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .

[54]  Jiwen Lu,et al.  Deep Metric Learning for Visual Understanding: An Overview of Recent Advances , 2017, IEEE Signal Processing Magazine.

[55]  Mikhail Belkin,et al.  Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.

[56]  Meng Wang,et al.  Coherent Semantic-Visual Indexing for Large-Scale Image Retrieval in the Cloud , 2017, IEEE Transactions on Image Processing.

[57]  Mukund Balasubramanian,et al.  The Isomap Algorithm and Topological Stability , 2002, Science.