Adaptive conformal semi-supervised vector quantization for dissimilarity data

Existing semi-supervised learning algorithms focus on vectorial data given in Euclidean space. However, many real-life data sets are non-metric and given as (dis-)similarities, a setting that has not been widely addressed. We propose a conformal prototype-based classifier for dissimilarity data in semi-supervised tasks. A 'secure region' of unlabeled data is identified and used to improve the model trained on the labeled data and to adapt the model complexity. The new approach (i) can deal directly with arbitrary symmetric dissimilarity matrices, (ii) offers intuitive classification by sparse prototypes, and (iii) adapts the model complexity. Experiments confirm the effectiveness of our approach in comparison to state-of-the-art methods.
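
To make the idea of a 'secure region' concrete, the following is a minimal sketch and not the method of this paper. It assumes one medoid prototype per class, a nonconformity score given by the ratio of the dissimilarity to the own-class prototype over the dissimilarity to the nearest other-class prototype, and a conformal p-value computed against the labeled points. The function names, the score, and the threshold `eps` are all illustrative assumptions. An unlabeled point is treated as secure when exactly one label has a p-value above `eps`.

```python
import numpy as np

def class_medoids(D, labeled_idx, labels):
    """Pick one medoid per class among the labeled points (assumed prototype choice)."""
    labeled_idx = np.asarray(labeled_idx)
    labels = np.asarray(labels)
    protos = {}
    for c in np.unique(labels):
        members = labeled_idx[labels == c]
        # Medoid: the member with the smallest summed dissimilarity to its class.
        sub = D[np.ix_(members, members)]
        protos[c] = members[np.argmin(sub.sum(axis=1))]
    return protos

def nonconformity(D, i, c, protos):
    """Assumed score: own-class prototype distance over nearest other-class distance."""
    d_same = D[i, protos[c]]
    d_other = min(D[i, p] for k, p in protos.items() if k != c)
    return d_same / (d_other + 1e-12)

def secure_region(D, labeled_idx, labels, unlabeled_idx, eps=0.2):
    """Return {unlabeled index: label} for points with exactly one plausible label."""
    protos = class_medoids(D, labeled_idx, labels)
    # Scores of the labeled points serve as the reference distribution.
    calib = np.array([nonconformity(D, i, c, protos)
                      for i, c in zip(labeled_idx, labels)])
    secure = {}
    for u in unlabeled_idx:
        accepted = []
        for c in protos:
            a = nonconformity(D, u, c, protos)
            # Conformal p-value: share of reference scores at least as large.
            p = (np.sum(calib >= a) + 1) / (len(calib) + 1)
            if p > eps:
                accepted.append(c)
        if len(accepted) == 1:
            secure[u] = accepted[0]
    return secure

# Toy data: dissimilarities are absolute differences of 1-D points.
pts = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0, 0.5, 11.5, 6.0])
D = np.abs(pts[:, None] - pts[None, :])
result = secure_region(D, np.arange(6), np.array([0, 0, 0, 1, 1, 1]),
                       np.array([6, 7, 8]), eps=0.2)
```

In this toy setting the two unlabeled points close to a cluster receive a single label, while the point midway between the prototypes is left out. Because the prototypes are chosen from the same labeled points that provide the reference scores, the p-values here are only heuristic; a held-out calibration set would be needed for the usual conformal validity guarantee.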
