Semi-supervised classification by reaching consensus among modalities

Deep learning models have demonstrated the ability to learn complex structures, but they can be restricted by the amount of available data. Recently, Consensus Networks (CNs) were proposed to alleviate data sparsity by utilizing features from multiple modalities, but they too have been limited by the size of the labeled data. In this paper, we extend CNs to Transductive Consensus Networks (TCNs), which are suitable for semi-supervised learning. In TCNs, different modalities of input are compressed into latent representations, which we encourage to become indistinguishable during iterative adversarial training. To understand the two mechanisms of TCNs, consensus and classification, we put forward three variants of the model in ablation studies on these mechanisms. To further investigate TCN models, we treat the latent representations as probability distributions and measure their similarities as the negative relative Jensen-Shannon divergences. We show that a consensus state beneficial for classification requires a stable but imperfect similarity between the representations. Overall, TCNs outperform or match the best benchmark algorithms given 20 to 200 labeled samples on the Bank Marketing and the DementiaBank datasets.
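
The similarity measure mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define how the "relative" divergence is computed, so the code below uses the standard Jensen-Shannon divergence (base 2, bounded in [0, 1]). The softmax step that turns a real-valued latent vector into a probability distribution, and the function names `softmax`, `js_divergence`, and `similarity`, are assumptions made for illustration only.

```python
import math


def softmax(values):
    """Map a real-valued vector to a probability distribution (assumed step)."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Standard Jensen-Shannon divergence between two distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def similarity(rep_a, rep_b):
    """Negative JS divergence between two latent vectors (hypothetical helper)."""
    return -js_divergence(softmax(rep_a), softmax(rep_b))


if __name__ == "__main__":
    # Two hypothetical latent vectors, one per modality.
    modality_a = [0.2, 1.5, -0.3]
    modality_b = [0.1, 1.4, -0.1]
    print(similarity(modality_a, modality_b))
```

Under this sketch, identical representations give a similarity of 0 (the maximum), and the value decreases toward -1 as the two distributions diverge.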
