Learning Local Semantic Distances with Limited Supervision

Recent advances in distance function learning have demonstrated that a well-learned distance metric can greatly improve performance in a wide variety of data mining and web search tasks. A major problem in such scenarios is that little labeled data is available for learning user intentions. Furthermore, distances are inherently local, so a single global distance function may not capture the distance structure well. Local distance learning is even harder when labeled information is limited, because the distance function varies with data locality. To address these issues, we propose a local metric learning algorithm termed Local Semantic Sensing (LSS), which augments the small amount of labeled data with unlabeled data in order to learn the semantic information in the manifold structure, and then integrates this information with supervised intentional knowledge in a local way. We present results in a retrieval application, which show that the approach significantly outperforms other state-of-the-art methods in the literature.
