Convolutional Conditional Neural Processes

We introduce the Convolutional Conditional Neural Process (ConvCNP), a new member of the Neural Process (NP) family that models translation equivariance in the data. Translation equivariance is an important inductive bias for many learning problems, including those involving time series, spatial data, and images. The ConvCNP embeds data sets into an infinite-dimensional function space, as opposed to a finite-dimensional vector space. To formalize this notion, we extend the theory of neural representations of sets to include functional representations, and demonstrate that any translation-equivariant embedding can be represented using a convolutional deep set. We evaluate ConvCNPs in several settings, demonstrating that they achieve state-of-the-art performance compared to existing NPs. We also demonstrate that building in translation equivariance enables zero-shot generalization to challenging, out-of-domain tasks.
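To make the notion of a translation-equivariant functional embedding concrete, the following is a minimal sketch, not the model described above. It assumes an embedding of the form E(Z)(x) = sum_i phi(y_i) psi(x - x_i), with phi(y) = (1, y) and psi a radial basis function; the function names, the kernel, and the length scale are illustrative assumptions that the abstract does not specify. The sketch checks numerically that shifting the context inputs shifts the embedded function by the same amount.

```python
import math


def functional_embedding(context_x, context_y, query_x, length_scale=0.5):
    """Evaluate E(Z)(x) = sum_i phi(y_i) * psi(x - x_i) at each query location.

    Assumptions for this sketch: phi(y) = (1, y), which yields a density
    channel and a data channel, and psi is an RBF kernel.
    """
    values = []
    for x in query_x:
        density = 0.0
        signal = 0.0
        for xi, yi in zip(context_x, context_y):
            weight = math.exp(-0.5 * ((x - xi) / length_scale) ** 2)
            density += weight
            signal += weight * yi
        values.append((density, signal))
    return values


def shift(xs, tau):
    """Translate every input location by tau."""
    return [x + tau for x in xs]


# Hypothetical 1-D context set and query locations.
context_x = [-1.0, 0.2, 0.9]
context_y = [0.5, -0.3, 1.2]
query_x = [-2.0, -0.5, 0.0, 1.5]
tau = 3.7

original = functional_embedding(context_x, context_y, query_x)
shifted = functional_embedding(shift(context_x, tau), context_y, shift(query_x, tau))

# Equivariance: embedding the shifted data and reading it out at shifted
# locations should match the original embedding up to rounding error.
max_gap = max(
    abs(a - b)
    for orig_pair, shifted_pair in zip(original, shifted)
    for a, b in zip(orig_pair, shifted_pair)
)
equivariant = max_gap < 1e-9
```

Because the kernel depends only on the difference x - x_i, the embedding is a function over the whole input domain and can be queried at any location, which is the sense in which it lives in a function space rather than in a fixed-length vector.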
