Learning Identity-Preserving Transformations on Data Manifolds

Many machine learning techniques incorporate identity-preserving transformations into their models to improve generalization to previously unseen data. These transformations are typically selected from a set of functions known to preserve an input’s identity when applied (e.g., rotation, translation, flipping, and scaling). However, many natural variations cannot be labeled for supervision or defined through examination of the data. As suggested by the manifold hypothesis, many of these natural variations live on or near a low-dimensional, nonlinear manifold. Several techniques represent manifold variations through a set of learned Lie group operators that define directions of motion on the manifold. However, these approaches are limited: they require transformation labels during training, and they lack a method for determining which regions of the manifold are appropriate for applying each operator. We address these limitations by introducing a learning strategy that does not require transformation labels and by developing a method that learns the local regions where each operator is likely to be used while preserving the identity of inputs. Experiments on MNIST and Fashion-MNIST highlight our model’s ability to learn identity-preserving transformations on multi-class datasets. Additionally, we train on CelebA to showcase our model’s ability to learn semantically meaningful transformations on complex datasets in an unsupervised manner.
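
To make the operator formulation concrete: in transport-operator models of this kind, a point z_0 on or near the manifold is moved along it by the matrix exponential of a weighted sum of learned operators, z_t = exp(Σ_m c_m A_m) z_0. The sketch below is a minimal illustration of that idea only, not the paper’s implementation; the operator matrices A and coefficients c are random placeholders standing in for learned quantities.

```python
# Minimal sketch of applying Lie group (transport) operators to a latent
# point. A and c are placeholders for quantities that would be learned.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
d, M = 8, 3                                # latent dimension, number of operators

A = rng.normal(scale=0.1, size=(M, d, d))  # placeholder "learned" operators A_m
z0 = rng.normal(size=d)                    # latent point on the manifold
c = np.array([0.5, 0.0, -0.3])             # coefficients weighting each operator

# Move z0 along the manifold: z_t = expm(sum_m c_m * A_m) @ z0.
# c = 0 gives the identity matrix, so z_t = z0 exactly; small coefficient
# magnitudes correspond to small, identity-preserving motions.
T = expm(np.einsum("m,mij->ij", c, A))
z_t = T @ z0
print(z_t.shape)  # (8,)
```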
