Online Learning of Deep Hybrid Architectures for Semi-supervised Categorization

We present a hybrid architecture capable of online learning from both labeled and unlabeled samples. It combines generative and discriminative objectives to derive a new variant of the Deep Belief Network, which we call the Stacked Boltzmann Experts Network. The model's training algorithm builds on principles developed for hybrid discriminative Boltzmann machines and composes deep architectures in a greedy, layer-wise fashion. The model exploits its inherent "layer-wise ensemble" structure to perform useful classification work. We (1) compare this architecture against a hybrid denoising-autoencoder version of itself as well as several other models, and (2) investigate training within an incremental learning procedure. The best-performing hybrid model, the Stacked Boltzmann Experts Network, consistently outperforms all others.
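The greedy, layer-wise composition mentioned above follows the standard deep-belief-network recipe: each layer is a restricted Boltzmann machine (RBM) trained on the hidden activations of the layer below. The abstract does not specify the Stacked Boltzmann Experts Network's hybrid objective, so the sketch below shows only the generic generative step, CD-1 training with greedy stacking. The `RBM` class, the `greedy_stack` helper, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli-Bernoulli RBM trained with one step of
    contrastive divergence (CD-1)."""

    def __init__(self, n_vis, n_hid, lr=0.1):
        self.W = rng.normal(0.0, 0.01, size=(n_vis, n_hid))
        self.b_vis = np.zeros(n_vis)
        self.b_hid = np.zeros(n_hid)
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hid)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_vis)

    def cd1_update(self, v0):
        # Positive phase: hidden probabilities given the data.
        h0 = self.hidden_probs(v0)
        # Negative phase: sample hiddens, reconstruct, re-infer hiddens.
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # CD-1 gradient approximation, averaged over the mini-batch.
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_vis += self.lr * (v0 - v1).mean(axis=0)
        self.b_hid += self.lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))  # reconstruction error

def greedy_stack(data, layer_sizes, epochs=5):
    """Greedily train a stack of RBMs: each new layer is fit on the
    hidden activations produced by the previously trained layer."""
    rbms, x = [], data
    for n_hid in layer_sizes:
        rbm = RBM(x.shape[1], n_hid)
        for _ in range(epochs):
            rbm.cd1_update(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)  # feed activations upward
    return rbms
```

In a hybrid model along the lines described in the abstract, each layer's update would also include a discriminative term (e.g., a per-layer classification loss), and the per-layer classifiers would act as the "layer-wise ensemble" at prediction time; that part is omitted here since the abstract gives no details.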
