Deep Neural Network Self-training Based on Unsupervised Learning and Dropout

In supervised learning, a large amount of labeled data is required to find reliable classification boundaries when training a classifier. In practice, however, labeled data is hard to obtain, and labeling is time-consuming and costly. Although unlabeled data is far more plentiful than labeled data, most supervised learning methods are not designed to exploit it. Self-training is a semi-supervised learning method that alternates between training a base classifier and labeling unlabeled data in the training set. Most self-training methods adopt confidence measures to select confidently labeled examples, because high confidence usually implies low error. A major difficulty of self-training is error amplification: if a classifier misclassifies some examples and those examples are added to the labeled training set, the next classifier may learn improper classification boundaries and generate even more misclassified examples. Because base classifiers are built from a small labeled dataset, they rarely achieve good generalization performance. Even with an improved training procedure and better classifiers, errors are inevitable, so self-labeled data must be corrected to avoid error amplification in subsequent classifiers. In this paper, we propose a deep neural network based approach that alleviates these problems of self-training by combining three schemes: unsupervised pre-training, dropout, and error forgetting. Applied to various datasets, a classifier trained with our approach outperforms a classifier trained with common self-training.
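The core loop of confidence-based self-training described above can be summarized in a short sketch. The snippet below is a minimal illustration, assuming a PyTorch MLP with dropout, a fixed confidence threshold, and an interpretation of error forgetting as regenerating all pseudo-labels from scratch in every round rather than accumulating them across rounds; the unsupervised pre-training step is omitted, and the network size, threshold, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

    # Minimal sketch of confidence-based self-training with dropout (PyTorch).
    # Assumptions: MLP architecture, threshold of 0.95, and "error forgetting"
    # modeled as rebuilding the pseudo-labeled pool every round.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MLP(nn.Module):
        def __init__(self, n_in, n_hidden, n_classes, p_drop=0.5):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_in, n_hidden), nn.ReLU(), nn.Dropout(p_drop),
                nn.Linear(n_hidden, n_classes),
            )

        def forward(self, x):
            return self.net(x)

    def train(model, x, y, epochs=50, lr=1e-3):
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        model.train()  # dropout active during training
        for _ in range(epochs):
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            opt.step()

    def self_train(x_lab, y_lab, x_unlab, n_classes, rounds=5, threshold=0.95):
        # Start with no pseudo-labels; they are regenerated ("forgotten") each round
        # so that earlier mistakes are not locked into the training set.
        mask = torch.zeros(len(x_unlab), dtype=torch.bool)
        pseudo = torch.zeros(len(x_unlab), dtype=torch.long)
        model = None
        for _ in range(rounds):
            model = MLP(x_lab.shape[1], 128, n_classes)
            x_aug = torch.cat([x_lab, x_unlab[mask]])
            y_aug = torch.cat([y_lab, pseudo[mask]])
            train(model, x_aug, y_aug)
            model.eval()  # disables dropout for prediction
            with torch.no_grad():
                probs = F.softmax(model(x_unlab), dim=1)
            conf, pseudo = probs.max(dim=1)
            mask = conf >= threshold  # keep only confident pseudo-labels for the next round
        return model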
