CLE_LMNN: A novel framework of LMNN based on clustering labeled examples

Abstract: Distance metric learning aims to automate the learning of task-specific distance functions in a supervised manner. In this paper, we study how to learn a Mahalanobis distance metric that improves nearest neighbor classification. Our paper makes two contributions. First, we propose a novel framework, CLE_LMNN, for Mahalanobis distance learning. CLE_LMNN builds on large margin nearest neighbor (LMNN) classification, a recently proposed framework. Unlike LMNN, CLE_LMNN learns a Mahalanobis distance in a fine-grained way by first partitioning the labeled examples into subsets. Our experiments show that this fine-grained approach tends to yield a Mahalanobis distance better suited to classification. Second, we present a novel algorithm, CLE, for clustering labeled examples. Unlike traditional unsupervised clustering algorithms, CLE uses the class information of the labeled examples to partition the examples of each class into subsets. To evaluate the proposed framework, we conduct extensive experiments on three real datasets. The results show that CLE_LMNN is effective for classification.
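
As a minimal illustration of why the choice of Mahalanobis metric matters for nearest neighbor classification, the sketch below computes the distance d_M(x, y) = sqrt((x - y)^T M (x - y)) with M = L^T L and shows a 1-NN prediction changing when L changes. It is not CLE_LMNN or LMNN: the matrices are picked by hand rather than learned, and the function names (`mahalanobis`, `knn_predict`) and the toy data are assumptions made for this example.

```python
import numpy as np

def mahalanobis(x, y, L):
    """Distance under M = L^T L, i.e. Euclidean distance after mapping by L."""
    diff = L @ (np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.sqrt(diff @ diff))

def knn_predict(query, X, labels, L, k=1):
    """Majority label among the k nearest examples under the metric given by L."""
    dists = np.array([mahalanobis(query, x, L) for x in X])
    nearest = np.argsort(dists, kind="stable")[:k]
    values, counts = np.unique(labels[nearest], return_counts=True)
    return values[np.argmax(counts)]

# Hypothetical toy data: two examples of class 0, one of class 1.
X = np.array([[0.0, 0.0], [0.0, 4.0], [3.0, 0.0]])
labels = np.array([0, 0, 1])
query = np.array([2.0, 3.0])

L_identity = np.eye(2)            # plain Euclidean distance
L_scaled = np.diag([1.0, 0.1])    # hand-picked: down-weights the second feature

print(knn_predict(query, X, labels, L_identity))  # nearest is [0, 4], class 0
print(knn_predict(query, X, labels, L_scaled))    # nearest is [3, 0], class 1
```

A metric learning method such as LMNN would fit L from the labeled examples instead of fixing it by hand; the abstract's fine-grained variant additionally partitions each class into subsets first, a step this sketch does not attempt.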
