On semi-supervised learning

Semi-supervised learning addresses the problem of how, if at all possible, to exploit a large amount of unlabelled data to perform classification in situations where, typically, only a few labelled data are available. Although this is not always possible (it depends on how useful knowledge of the distribution of the unlabelled data is for inferring the labels), several algorithms have been proposed recently. A new algorithm is proposed that, under almost necessary conditions, asymptotically attains the performance of the best theoretical rule as the size of the unlabelled sample tends to infinity. The set of necessary assumptions, although reasonable, shows that semi-parametric classification works only for very well-conditioned problems.
