Feature-Matching Auto-Encoders

We examine how learning from unaligned data can both improve the data efficiency of supervised tasks and enable alignments without any supervision. For example, consider unsupervised machine translation: the input is two corpora, one of English and one of French, and the task is to translate from one language to the other without any paired English-French sentences. To address this, we develop feature-matching auto-encoders (FMAEs). FMAEs ensure that the marginal distribution of feature layers is preserved across forward and inverse mappings between domains. FMAEs achieve state of the art for semi-supervised neural machine translation, with significant BLEU score gains of up to 5.7 and 6.3 over traditional supervised models. Furthermore, on English-to-German, FMAEs outperform last year's best models such as ByteNet [15] while using only half as many supervised examples.
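The abstract does not spell out the training objective, so the following is a minimal sketch of the feature-matching idea under stated assumptions: each domain gets an auto-encoder, and a simple moment-matching penalty pulls the feature-layer statistics of data mapped in from the other domain toward those of real data in the target domain. Names such as DomainAutoEncoder, fmae_step, and the weight lam are illustrative and not from the paper.

```python
# Illustrative sketch only: matches batch means of feature activations as a
# surrogate for preserving the marginal distribution of feature layers across
# forward (X -> Y) and inverse (Y -> X) mappings between unaligned domains.
import torch
import torch.nn as nn


class DomainAutoEncoder(nn.Module):
    """Encoder/decoder pair for one domain; the encoder output is the
    'feature layer' whose marginal distribution we try to preserve."""

    def __init__(self, dim, feat_dim):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, feat_dim), nn.ReLU())
        self.decoder = nn.Linear(feat_dim, dim)

    def forward(self, x):
        h = self.encoder(x)
        return self.decoder(h), h


def feature_matching_loss(h_real, h_mapped):
    # Penalize the gap between batch means of feature activations
    # (first-moment matching; an assumption, not the paper's exact loss).
    return (h_real.mean(dim=0) - h_mapped.mean(dim=0)).pow(2).sum()


def fmae_step(ae_x, ae_y, x_batch, y_batch, lam=1.0):
    # Reconstruction terms on each unaligned corpus.
    x_rec, h_x = ae_x(x_batch)
    y_rec, h_y = ae_y(y_batch)
    recon = nn.functional.mse_loss(x_rec, x_batch) + \
            nn.functional.mse_loss(y_rec, y_batch)

    # Forward mapping X -> Y: decode X features with Y's decoder, re-encode
    # with Y's encoder, and match against features of real Y data.
    y_from_x = ae_y.decoder(h_x)
    h_y_from_x = ae_y.encoder(y_from_x)

    # Inverse mapping Y -> X, matched against features of real X data.
    x_from_y = ae_x.decoder(h_y)
    h_x_from_y = ae_x.encoder(x_from_y)

    fm = feature_matching_loss(h_y, h_y_from_x) + \
         feature_matching_loss(h_x, h_x_from_y)
    return recon + lam * fm
```

With two such auto-encoders sharing a feature dimension, e.g. ae_x = DomainAutoEncoder(300, 128) and ae_y = DomainAutoEncoder(300, 128), the loss returned by fmae_step can be minimized with any standard optimizer on unaligned batches drawn independently from the two corpora.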

[1] Oriol Vinyals, et al. Towards Principled Unsupervised Learning, 2015, ArXiv.

[2] Masahiro Suzuki, et al. Joint Multimodal Learning with Deep Generative Models, 2016, ICLR.

[3] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.

[4] Kevin Murphy, et al. Generative Models of Visually Grounded Imagination, 2017, ICLR.

[5] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.

[6] Léon Bottou, et al. Wasserstein GAN, 2017, ArXiv.

[7] Brendan J. Frey, et al. Continuous Sigmoidal Belief Networks Trained using Slice Sampling, 1996, NIPS.

[8] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.

[9] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.

[10] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.

[11] Bernt Schiele, et al. Generative Adversarial Text to Image Synthesis, 2016, ICML.

[12] Demis Hassabis, et al. SCAN: Learning Abstract Hierarchical Compositional Visual Concepts, 2017, ArXiv.

[13] Alexei A. Efros, et al. Image-to-Image Translation with Conditional Adversarial Networks, 2017, CVPR.

[14] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.

[15] Alex Graves, et al. Neural Machine Translation in Linear Time, 2016, ArXiv.

[16] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.

[17] Michael I. Jordan, et al. An Introduction to Variational Methods for Graphical Models, 1999, Machine Learning.

[18] Honglak Lee, et al. Deep Variational Canonical Correlation Analysis, 2016, ArXiv.