Pseudo-Labeling Using Gaussian Process for Semi-Supervised Deep Learning

The goal of semi-supervised learning is to improve the performance of supervised learning tasks using unlabeled data. Deep learning models, in particular, stand to benefit from large-scale unlabeled data. We propose a simple, novel method of utilizing unlabeled data to improve the performance of a deep learning model. First, we train a Gaussian process classifier (GPC) and use its predictions on the unlabeled data as pseudo-labels. Then, we pre-train the deep learning model with the pseudo-labeled data to initialize the model's parameters. Finally, we fine-tune it with the labeled data. We apply this method to five video classes from the UCF101 data set in a setting with only a small amount of labeled data. The experimental results show significant performance improvements over the deep learning model without pre-training. We further study the effect of pre-training with pseudo-labels at different predicted probability levels and give practical advice on choosing them. Moreover, we propose training the GPC on stratified-sampled data (SGPC) to reduce its computation time when the data set is large. Finally, we generalize the proposed pseudo-labeling method to any well-performing classifier and give advice on selecting a pseudo-labeling method.
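
As a rough illustration of the pipeline described above, the following is a minimal, self-contained sketch using scikit-learn, with GaussianProcessClassifier standing in for the GPC and a small MLPClassifier standing in for the deep model. The synthetic data, the 0.3 confidence threshold, the subsample size mentioned in the comments, and the epoch counts are all illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the data: a small labeled set and a large
# unlabeled pool (mirroring the five-class setting of the paper).
X, y = make_classification(n_samples=2000, n_classes=5,
                           n_informative=10, random_state=0)
X_l, X_u, y_l, _ = train_test_split(X, y, train_size=100,
                                    stratify=y, random_state=0)

# Step 1: train the GPC on the labeled data. For a large labeled set,
# the SGPC variant would first take a stratified subsample, e.g.
# train_test_split(X_l, y_l, train_size=500, stratify=y_l), to keep
# the GPC's cubic training cost manageable.
gpc = GaussianProcessClassifier(random_state=0).fit(X_l, y_l)

# Step 2: pseudo-label the unlabeled pool, keeping only predictions
# above an illustrative confidence threshold (the paper studies how
# the predicted probability of pseudo-labels affects pre-training).
proba = gpc.predict_proba(X_u)
keep = proba.max(axis=1) >= 0.3          # illustrative threshold
X_p = X_u[keep]
y_p = gpc.classes_[proba.argmax(axis=1)[keep]]

# Step 3: pre-train the network on the pseudo-labeled data, then
# fine-tune it on the labeled data. partial_fit keeps the learned
# weights between the two phases.
net = MLPClassifier(hidden_layer_sizes=(64,), random_state=0)
classes = np.unique(y_l)
for _ in range(100):                      # pre-training phase
    net.partial_fit(X_p, y_p, classes=classes)
for _ in range(100):                      # fine-tuning phase
    net.partial_fit(X_l, y_l)
```

In the last step, the pre-trained weights serve only as an initialization; fine-tuning on the true labels proceeds from them rather than from a random start, which is where the reported improvement comes from.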
