Online Multi-Modal Distance Metric Learning with Application to Image Retrieval

Distance metric learning (DML) is an important technique for improving similarity search in content-based image retrieval. Although DML has been studied extensively, most existing approaches adopt a single-modal learning framework that learns the distance metric on either a single feature type or a combined feature space in which multiple feature types are simply concatenated. Such single-modal DML methods suffer from two critical limitations: (i) some feature types may significantly dominate the others in the DML task because of their diverse feature representations; and (ii) learning a distance metric on the combined high-dimensional feature space obtained by naive feature concatenation can be extremely time-consuming. To address these limitations, we investigate a novel scheme of online multi-modal distance metric learning (OMDML), which explores a unified two-level online learning scheme: (i) it learns to optimize a distance metric on each individual feature space; and (ii) it then learns to find the optimal combination of the diverse feature types. To further reduce the cost of DML on high-dimensional feature spaces, we propose a low-rank OMDML algorithm that not only significantly reduces the computational cost but also retains highly competitive or even better learning accuracy. We conduct extensive experiments to evaluate the proposed algorithms for multi-modal image retrieval, and the encouraging results validate the effectiveness of the proposed technique.
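
The two-level scheme described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes triplet feedback (query, relevant image, irrelevant image), a hinge-loss online gradient step with a projection onto positive semidefinite matrices for each per-modality metric, and a Hedge-style multiplicative update for the combination weights. The class name `OnlineMultiModalMetric` and the parameters `eta`, `beta`, and `margin` are hypothetical names chosen for illustration.

```python
import numpy as np


class OnlineMultiModalMetric:
    """Illustrative two-level online learner (assumed design, not the paper's)."""

    def __init__(self, dims, eta=0.01, beta=0.9, margin=1.0):
        # Level 1: one Mahalanobis matrix per modality, started at identity.
        self.M = [np.eye(d) for d in dims]
        # Level 2: combination weights over modalities, started uniform.
        self.theta = np.full(len(dims), 1.0 / len(dims))
        self.eta, self.beta, self.margin = eta, beta, margin

    def distance(self, x, y):
        """Weighted sum of per-modality squared Mahalanobis distances."""
        total = 0.0
        for t, M, a, b in zip(self.theta, self.M, x, y):
            d = a - b
            total += t * float(d @ M @ d)
        return total

    def update(self, query, pos, neg):
        """One online round on a triplet; each argument is a list of arrays."""
        for i, M in enumerate(self.M):
            dp = query[i] - pos[i]
            dn = query[i] - neg[i]
            gap = float(dp @ M @ dp - dn @ M @ dn)
            # Level 2: discount a modality that ranks this triplet wrongly.
            if gap >= 0:
                self.theta[i] *= self.beta
            # Level 1: gradient step when the hinge loss is active.
            if self.margin + gap > 0:
                M = M - self.eta * (np.outer(dp, dp) - np.outer(dn, dn))
                # Project back onto the PSD cone by clipping eigenvalues.
                w, V = np.linalg.eigh((M + M.T) / 2)
                self.M[i] = (V * np.clip(w, 0.0, None)) @ V.T
        self.theta /= self.theta.sum()
```

Each modality keeps its own matrix, so no metric is ever learned on the concatenated feature space, and a modality that keeps misranking triplets loses weight in the combined distance. The full eigendecomposition used for the projection costs cubic time in the feature dimension, which is the kind of expense that motivates a low-rank variant.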
