Learning Robust Representations via Multi-View Information Bottleneck

The information bottleneck method provides an information-theoretic view of representation learning. The original formulation, however, can only be applied in the supervised setting, where task-specific labels are available at training time. We extend this method to the unsupervised setting by taking advantage of multi-view data, which provides two views of the same underlying entity. A theoretical analysis leads to the definition of a new multi-view model that produces state-of-the-art results on two standard multi-view datasets, Sketchy and MIR-Flickr. We also extend our theory to the single-view setting by using standard data augmentation techniques, and show empirically that the resulting representations generalize better than those of traditional unsupervised approaches.
