Semi-supervised Learning by Latent Space Energy-Based Model of Symbol-Vector Coupling

This paper proposes a latent space energy-based prior model for semi-supervised learning. The model is built on a generator network that maps a latent vector to the observed example. The energy term of the prior model couples the latent vector with a symbolic one-hot vector, so that classification can be based on the latent vector inferred from the observed example. In our learning method, the symbol-vector coupling, the generator network, and the inference network are learned jointly. Our method applies to semi-supervised learning across data domains such as images, text, and tabular data, and our experiments demonstrate that it performs well on semi-supervised learning tasks.
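To make the symbol-vector coupling concrete, below is a minimal PyTorch sketch, assuming the prior takes the form p(y, z) ∝ exp(⟨y, f(z)⟩) N(z; 0, I), where y is a K-way one-hot symbol vector and f is a small network producing K logits; the class name, layer sizes, and the MLP form of f are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SymbolVectorCouplingPrior(nn.Module):
    """Sketch of a latent-space EBM prior coupling a continuous latent z
    with a K-way one-hot symbol y:  p(y, z) ~ exp(<y, f(z)>) N(z; 0, I).
    The MLP f and its sizes are illustrative assumptions."""

    def __init__(self, latent_dim=64, num_classes=10, hidden_dim=200):
        super().__init__()
        # Small MLP producing K logits f(z); <y, f(z)> selects one logit.
        self.f = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden_dim, num_classes),
        )

    def energy(self, z):
        # Marginalizing the one-hot y over its K values gives
        #   p(z) ~ exp(logsumexp_k f_k(z)) N(z; 0, I),
        # so the negative log of the (unnormalized) prior is:
        return -torch.logsumexp(self.f(z), dim=-1) + 0.5 * (z ** 2).sum(dim=-1)

    def classify(self, z):
        # p(y | z) = softmax(f(z)): classification from the inferred latent.
        return F.softmax(self.f(z), dim=-1)


# Usage sketch: score latents and classify from them (shapes illustrative).
prior = SymbolVectorCouplingPrior()
z = torch.randn(8, 64)             # e.g. latents from an inference network
print(prior.energy(z).shape)       # torch.Size([8])
print(prior.classify(z).sum(-1))   # each row of p(y|z) sums to 1
```

Under this assumed form, marginalizing out the one-hot y turns the coupling term into a log-sum-exp over the K logits, and classification reduces to a softmax over f(z) evaluated at the latent vector inferred from a given example.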
