Greedy multi-class label propagation

In many real-world applications, such as image classification, labeled training examples are difficult to obtain while unlabeled examples are readily available. Semi-supervised learning methods exploit both labeled and unlabeled examples in this setting. This paper proposes a greedy graph-based semi-supervised learning (GGSL) approach for multi-class classification problems. Labels are propagated through different graphs, which are built from neighborhoods of different sizes. The method assumes that nearby points share the same label: it starts with a small neighborhood, where a reliable decision can be obtained, and then iterates with larger neighborhoods, where more examples are needed to determine the label of an example. Experimental results on toy datasets and real datasets, such as handwritten digit recognition, demonstrate the effectiveness of the proposed approach, provided that a well-chosen distance is used. The method does not require any hyper-parameter tuning. On handwritten digits (MNIST), it achieves a recognition rate of 97.16% with only one labeled example per class in the training dataset.
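The idea of propagating labels through neighborhoods of increasing size can be illustrated with a minimal sketch. The abstract does not state the exact decision rule, so everything specific here is an assumption made for illustration: the function name `greedy_propagate`, the `-1` marker for unlabeled points, the Euclidean distance, and the rule that a point is labeled only when all already-labeled points among its k nearest neighbors agree. It should not be read as the authors' GGSL algorithm.

```python
import numpy as np

def greedy_propagate(X, y, max_k=None):
    """Illustrative greedy propagation; y uses -1 for unlabeled points.

    Assumed rule (not taken from the paper): an unlabeled point adopts a
    label when the labeled points among its k nearest neighbours all agree.
    """
    X = np.asarray(X, dtype=float)
    y = np.array(y, dtype=int)  # work on a copy
    n = len(X)
    max_k = n - 1 if max_k is None else min(max_k, n - 1)

    # Pairwise Euclidean distances (assumed metric); a point is never
    # its own neighbour, so the diagonal is pushed to infinity.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    order = np.argsort(dist, axis=1)  # neighbours, nearest first

    for k in range(1, max_k + 1):  # grow the neighbourhood
        changed = True
        while changed:  # repeat at this size until nothing new is labeled
            changed = False
            for i in np.flatnonzero(y == -1):
                votes = y[order[i, :k]]
                votes = votes[votes != -1]
                if votes.size and np.all(votes == votes[0]):
                    y[i] = votes[0]  # greedy: decision is final
                    changed = True
        if not np.any(y == -1):
            break
    return y  # points with conflicting neighbours may stay at -1

# Two well-separated 1-D clusters, one labeled example per class.
X = [[0.0], [0.1], [0.25], [5.0], [5.1], [5.25]]
y = [0, -1, -1, 1, -1, -1]
result = greedy_propagate(X, y)
print(result)
```

In this toy example every point is reached at the smallest neighborhood size; on real data, larger neighborhoods are used only for the points that the smaller ones leave undecided, and the quality of the result depends on the distance chosen.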
