Semi-supervised Learning by Latent Space Energy-Based Model of Symbol-Vector Coupling

This paper proposes a latent space energy-based prior model for semi-supervised learning. The model is built on a generator network that maps a latent vector to the observed example. The energy term of the prior model couples the latent vector with a symbolic one-hot vector, so that classification can be based on the latent vector inferred from the observed example. In our learning method, the symbol-vector coupling, the generator network, and the inference network are learned jointly. Our method applies to semi-supervised learning across data domains such as images, text, and tabular data, and our experiments demonstrate that it performs well on semi-supervised learning tasks.
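To make the symbol-vector coupling concrete, below is a minimal PyTorch sketch, assuming the prior takes the form p(y, z) ∝ exp(⟨y, f(z)⟩) N(z; 0, I), where y is a K-way one-hot symbol vector and f is a small network producing K logits; the class name, layer sizes, and the MLP form of f are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SymbolVectorCouplingPrior(nn.Module):
    """Sketch of a latent-space EBM prior coupling a continuous latent z
    with a K-way one-hot symbol y:  p(y, z) ~ exp(<y, f(z)>) N(z; 0, I).
    The MLP f and its sizes are illustrative assumptions."""

    def __init__(self, latent_dim=64, num_classes=10, hidden_dim=200):
        super().__init__()
        # Small MLP producing K logits f(z); <y, f(z)> selects one logit.
        self.f = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden_dim, num_classes),
        )

    def energy(self, z):
        # Marginalizing the one-hot y over its K values gives
        #   p(z) ~ exp(logsumexp_k f_k(z)) N(z; 0, I),
        # so the negative log of the (unnormalized) prior is:
        return -torch.logsumexp(self.f(z), dim=-1) + 0.5 * (z ** 2).sum(dim=-1)

    def classify(self, z):
        # p(y | z) = softmax(f(z)): classification from the inferred latent.
        return F.softmax(self.f(z), dim=-1)


# Usage sketch: score latents and classify from them (shapes illustrative).
prior = SymbolVectorCouplingPrior()
z = torch.randn(8, 64)             # e.g. latents from an inference network
print(prior.energy(z).shape)       # torch.Size([8])
print(prior.classify(z).sum(-1))   # each row of p(y|z) sums to 1
```

Under this assumed form, marginalizing out the one-hot y turns the coupling term into a log-sum-exp over the K logits, and classification reduces to a softmax over f(z) evaluated at the latent vector inferred from a given example.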
