Unsupervised Representation Disentanglement of Text: An Evaluation on Synthetic Datasets

To highlight the challenges of achieving representation disentanglement in the text domain in an unsupervised setting, in this paper we select a representative set of models that have been applied successfully in the image domain. We evaluate these models on six disentanglement metrics, as well as on downstream classification tasks and latent homotopy (interpolation). To facilitate the evaluation, we propose two synthetic datasets with known generative factors. Our experiments highlight the gap that remains in the text domain and illustrate that certain elements, such as representation sparsity (as an inductive bias) or the coupling of the representation with the decoder, can affect disentanglement. To the best of our knowledge, our work is the first attempt at the intersection of unsupervised representation disentanglement and text, and it provides an experimental framework and datasets for examining future developments in this direction.
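When the ground-truth generative factors of a dataset are known, as in the synthetic datasets proposed here, disentanglement metrics can be computed directly from the latent codes. The following is a minimal, illustrative sketch of one such metric, the Mutual Information Gap (MIG); it is not the paper's own evaluation code, and it assumes discrete ground-truth factors and a latent matrix already produced by a trained encoder.

```python
# Illustrative MIG sketch: for each ground-truth factor, estimate the mutual
# information with every latent dimension, take the gap between the two most
# informative dimensions, and normalize by the factor's entropy.
import numpy as np
from sklearn.metrics import mutual_info_score


def discretize(x, bins=20):
    """Histogram-discretize a continuous latent dimension for MI estimation."""
    edges = np.histogram(x, bins=bins)[1][:-1]
    return np.digitize(x, edges)


def mig(latents, factors, bins=20):
    """latents: (N, D) latent codes; factors: (N, K) discrete ground-truth factors."""
    n_dims, n_factors = latents.shape[1], factors.shape[1]
    mi = np.zeros((n_dims, n_factors))
    for d in range(n_dims):
        z_d = discretize(latents[:, d], bins)
        for k in range(n_factors):
            mi[d, k] = mutual_info_score(factors[:, k], z_d)
    gaps = []
    for k in range(n_factors):
        sorted_mi = np.sort(mi[:, k])[::-1]
        entropy_k = mutual_info_score(factors[:, k], factors[:, k])  # H(v_k)
        if entropy_k > 0:
            gaps.append((sorted_mi[0] - sorted_mi[1]) / entropy_k)
    return float(np.mean(gaps))
```

A higher MIG indicates that each factor is captured predominantly by a single latent dimension; the other metrics used in this kind of evaluation follow the same pattern of pairing known factors with learned latents.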
