Latent Normalizing Flows for Discrete Sequences

Normalizing flows are a powerful class of generative models for continuous random variables, offering both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desirable when modeling discrete random variables such as text, but applying normalizing flows directly to discrete sequences poses significant additional challenges. We propose a VAE-based generative model that jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures that maximize model flexibility. Experiments consider two common discrete-sequence tasks: character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, while a non-autoregressive flow-based model can improve generation speed at a cost in performance.
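The abstract describes a latent-variable setup: a flow-based prior over a continuous latent sequence, an inference network, and a stochastic (token-wise) decoder into the discrete space, trained with a variational objective. The sketch below is only a minimal illustration of that general setup under assumed shapes, module choices, and hyperparameters; it is not the paper's implementation, and all names here are hypothetical.

```python
# Minimal sketch of a latent-flow VAE for discrete sequences (hypothetical names
# and hyperparameters; not the paper's implementation). A flow-based prior is
# placed over a continuous latent sequence z of shape (T, D); the decoder emits
# every token independently given z_t, so decoding is non-autoregressive.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer on the feature dimension (dim must be even)."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim))   # produces scale and shift

    def forward(self, z):
        z1, z2 = z.chunk(2, dim=-1)
        s, t = self.net(z1).chunk(2, dim=-1)
        s = torch.tanh(s)                          # bound the log-scale for stability
        z2 = z2 * torch.exp(s) + t
        return torch.cat([z1, z2], dim=-1), s.sum(dim=(-2, -1))  # logdet over (T, D/2)

class LatentFlowVAE(nn.Module):
    def __init__(self, vocab, dim=32, hidden=128, n_flows=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.enc = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.to_mu_logvar = nn.Linear(2 * hidden, 2 * dim)
        self.flows = nn.ModuleList([AffineCoupling(dim) for _ in range(n_flows)])
        self.dec = nn.Linear(dim, vocab)           # token-wise categorical emission

    def forward(self, x):                          # x: (B, T) integer tokens
        h, _ = self.enc(self.embed(x))
        mu, logvar = self.to_mu_logvar(h).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)       # reparameterize
        # Prior term: run z through the flows toward a standard-normal base density.
        zk, log_p = z, 0.0
        for flow in self.flows:
            zk, logdet = flow(zk)
            log_p = log_p + logdet
            zk = torch.flip(zk, dims=[-1])         # swap halves between couplings (logdet 0)
        log_p = log_p - 0.5 * (zk.pow(2) + math.log(2 * math.pi)).sum(dim=(-2, -1))
        # Reconstruction and Gaussian-entropy terms of the ELBO.
        logits = self.dec(z)
        rec = -F.cross_entropy(logits.transpose(1, 2), x, reduction="none").sum(-1)
        entropy = 0.5 * (logvar + 1.0 + math.log(2 * math.pi)).sum(dim=(-2, -1))
        return -(rec + log_p + entropy).mean()     # negative ELBO to minimize
```

To generate in this sketch, one would sample base noise, invert the flow stack, and decode all time steps in parallel, which is where the non-autoregressive speed advantage mentioned in the abstract would come from; the multimodal flow architectures the paper proposes would replace the simple coupling prior used here.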
