Categorical Normalizing Flows via Continuous Transformations

Despite their popularity, the application of normalizing flows to categorical data remains limited to date. The common practice of using dequantization to map discrete data into a continuous space is inapplicable here, as categorical data has no intrinsic order. Instead, categorical variables have complex, latent relations that must be inferred, such as the synonymy between words. In this paper, we investigate Categorical Normalizing Flows, i.e., normalizing flows for categorical data. By casting the encoding of categorical data into continuous space as a variational inference problem, we jointly optimize the continuous representation and the model likelihood. To maintain unique decoding, we learn a partitioning of the latent space by factorizing the posterior. The complex relations between the categorical variables are instead learned by the subsequent normalizing flow, which keeps the likelihood estimate close to exact and makes it possible to scale to a large number of categories. Building on Categorical Normalizing Flows, we propose GraphCNF, a permutation-invariant generative model for graphs that outperforms both one-shot and autoregressive flow-based state-of-the-art models on molecule generation.
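To make the encoding scheme concrete, the following is a minimal sketch (not the paper's implementation) of the core idea: each category is assigned a region of continuous latent space via a per-category Gaussian, and a factorized posterior over categories partitions the space so that decoding stays unique. The per-category means, the noise scale, and the function names are all illustrative assumptions; in the actual model the representations are learned jointly with the flow.

```python
import numpy as np

rng = np.random.default_rng(0)

K, D = 5, 2  # number of categories, latent dimensionality
# Illustrative, well-separated per-category means (learned in the real model):
angles = np.linspace(0.0, 2.0 * np.pi, K, endpoint=False)
means = np.stack([np.cos(angles), np.sin(angles)], axis=1)
sigma = 0.1  # encoder standard deviation

def encode(x):
    """Sample a continuous representation z ~ q(z|x) = N(mu_x, sigma^2 I)."""
    return means[x] + sigma * rng.normal(size=D)

def decode(z):
    """Posterior p(x|z) over categories. With Gaussian likelihoods and a
    uniform prior, this is a softmax over negative squared distances."""
    logits = -0.5 * np.sum((z - means) ** 2, axis=1) / sigma**2
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# With a small sigma, the partitioning is near-deterministic and the
# original category is recovered from its continuous encoding:
z = encode(3)
print(decode(z).argmax())
```

Because `sigma` is small relative to the spacing of the means, the posterior concentrates on one category per latent region, which is what allows the ensuing flow to model dependencies in continuous space without losing the discrete information.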
