An Unsupervised Algorithm For Learning Lie Group Transformations

We present several theoretical contributions that allow Lie groups to be fit to high-dimensional datasets. Transformation operators are represented in their eigen-basis, reducing the computational complexity of parameter estimation to that of training a linear transformation model. A transformation-specific "blurring" operator is introduced that allows inference to escape local minima by smoothing the transformation space. A penalty on traversed manifold distance is added, which encourages the discovery of sparse, minimal-distance transformations between states. Both learning and inference are demonstrated using these methods for the full set of affine transformations on natural image patches. Transformation operators are then trained on natural video sequences. The learned video transformations are shown to provide a better description of inter-frame differences than the standard motion model based on rigid translation.
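The eigen-basis idea can be illustrated with a small sketch. It is not the paper's implementation: it assumes a single, known, diagonalizable generator A, and the function name `transform_via_eigenbasis` is hypothetical. It relies only on the standard identity exp(sA) = U exp(sΛ) U⁻¹ for A = U Λ U⁻¹, so that once the data is projected into the eigen-basis the operator acts elementwise.

```python
import numpy as np

def transform_via_eigenbasis(A, s, x):
    """Apply T(s) = exp(s * A) to x via the eigendecomposition of A.

    Assumes A is diagonalizable (an assumption of this sketch).
    """
    lam, U = np.linalg.eig(A)                    # A = U diag(lam) U^{-1}
    coeffs = np.linalg.solve(U, x.astype(complex))  # project x into the eigen-basis
    # In the eigen-basis the operator is diagonal: elementwise scaling.
    y = U @ (np.exp(s * lam) * coeffs)
    return y.real  # A and x are real here, so the imaginary part is round-off

# Illustrative generator: 2-D rotation, for which exp(s * A) rotates by angle s.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
x = np.array([1.0, 0.0])
y = transform_via_eigenbasis(A, np.pi / 2, x)    # quarter turn of the unit x-vector
```

For this rotation generator the eigenvalues are ±i, so the diagonal factor is a pair of complex phases; the result is real again after mapping back out of the eigen-basis.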
